As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, prevalent data-driven techniques such as large-scale language models suffer from inadequate data and biased corpora, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation (CORGI-PM), which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which require models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To the best of our knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.
translated by Google Translate
Multi-modal fusion is a basic task in autonomous driving system perception and has attracted the interest of many scholars in recent years. Current multi-modal fusion methods focus mainly on camera and LiDAR data, but pay little attention to the kinematic information provided by the bottom sensors of the vehicle, such as acceleration, vehicle speed, and angle of rotation. This information is not affected by complex external scenes, so it is more robust and reliable. In this paper, we introduce the existing application fields of vehicle bottom information and the research progress of related methods, as well as the multi-modal fusion methods based on bottom information. We also describe the available vehicle bottom information datasets in detail to help researchers get started quickly. In addition, new ideas for multi-modal fusion technology in autonomous driving tasks are proposed to promote the further utilization of vehicle bottom information.
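In this paper, we present a fast monocular depth estimation method for enabling 3D perception capabilities on low-cost underwater robots. We formulate a novel end-to-end deep visual learning pipeline named UDepth, which incorporates domain knowledge of the image formation characteristics of natural underwater scenes. First, we adapt a new input space by exploiting underwater light attenuation, and then devise a least-squares formulation for coarse pixel-wise depth prediction. Subsequently, we extend this into a domain projection loss that guides the end-to-end learning of UDepth on over 9K RGB-D training samples. UDepth is designed with a computationally light MobileNetV2 backbone and a Transformer-based optimizer to ensure fast inference rates on embedded systems. Through domain-aware design choices and comprehensive experimental analyses, we demonstrate that it is possible to achieve state-of-the-art depth estimation performance while ensuring a small computational footprint. Specifically, with 70%-80% fewer network parameters than existing benchmarks, UDepth achieves comparable and often better depth estimation performance. While the full model offers inference rates of over 66 FPS (13 FPS) on a single GPU (CPU core), our domain projection for coarse depth prediction runs at 51.5 FPS on single-board NVIDIA Jetson TX2s. The inference pipelines are available at https://github.com/uf-robopi/udepth.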
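Many recent approaches to natural language tasks are built on the remarkable abilities of large language models. Large language models can perform in-context learning, in which they learn a new task from a few task demonstrations without any parameter updates. This work examines the implications of in-context learning for the creation of datasets for new natural language tasks. Departing from recent in-context learning methods, we formulate an annotation-efficient, two-step framework: selective annotation, which chooses a pool of examples to annotate from unlabeled data in advance, followed by prompt retrieval, which retrieves task examples from the annotated pool at test time. Based on this framework, we propose an unsupervised, graph-based selective annotation method, vote-k, to select diverse, representative examples to annotate. Extensive experiments on 10 datasets (covering classification, commonsense reasoning, dialogue, and text/code generation) demonstrate that our selective annotation method improves task performance by a large margin. Compared to randomly selecting examples to annotate, vote-k achieves an average relative gain of 12.9%/11.4% under the annotation budget. Compared to state-of-the-art supervised finetuning approaches, it yields similar performance with 10-100x lower annotation cost across the 10 tasks. We further analyze the effectiveness of our framework in various scenarios: language models of varying sizes, alternative selective annotation methods, and cases where there is a test data domain shift. We hope that our studies will serve as a basis for data annotation as large language models are increasingly applied to new tasks. Our code is available at https://github.com/hkunlp/icl-selactive-annotation.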
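The prompt retrieval step of the two-step framework can be illustrated with a minimal sketch. It assumes that each annotated example already has an embedding vector and that cosine similarity is the retrieval criterion; the pool format, the function names, and the choice of similarity measure are hypothetical and are not taken from the paper. The graph-based vote-k selection step itself is not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_prompts(test_vec, annotated_pool, k=2):
    """Return the k annotated examples whose embeddings are most similar to test_vec.

    annotated_pool is a list of (embedding, example) pairs; this layout is an
    assumption made for illustration only.
    """
    ranked = sorted(
        annotated_pool,
        key=lambda item: cosine(test_vec, item[0]),
        reverse=True,
    )
    return [example for _, example in ranked[:k]]


# Toy annotated pool with two-dimensional embeddings (made-up values).
pool = [([1.0, 0.0], "ex_a"), ([0.0, 1.0], "ex_b"), ([0.9, 0.1], "ex_c")]
print(retrieve_prompts([1.0, 0.0], pool, k=2))
```

The retrieved examples would then be placed in the prompt as demonstrations for the test input.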
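Breast cancer is one of the leading causes of cancer death in women. As the primary output of breast screening, breast ultrasound (US) video contains exclusive dynamic information for cancer diagnosis. However, training models for video analysis is non-trivial, as it requires a voluminous dataset that is also expensive to annotate. Furthermore, the diagnosis of breast lesions faces unique challenges such as inter-class similarity and intra-class variation. In this paper, we propose a pioneering approach that directly utilizes US videos in computer-aided breast cancer diagnosis. It leverages masked video modeling as pretraining to reduce reliance on dataset size and detailed annotations. Moreover, a correlation-aware contrastive loss is developed to facilitate identification of the internal and external relationships between benign and malignant lesions. Experimental results show that our proposed approach achieves promising classification performance and can outperform other state-of-the-art methods.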
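Deep learning-based tomographic image reconstruction has attracted much attention in recent years. Sparse-view data reconstruction is a typical underdetermined inverse problem, and reconstructing high-quality CT images from dozens of projections remains a challenge in practice. To address this challenge, in this paper we propose a Multi-domain Integrative Swin Transformer network (MIST-net). First, using flexible network architectures, the proposed MIST-net incorporates lavish domain features from the data, residual-data, image, and residual-image domains. Here, the residual-data and residual-image domain network components can be considered data consistency modules that eliminate interpolation errors in both the residual-data and image domains and then further retain image details. Second, to detect image features and further protect image edges, a trainable Sobel filter is incorporated into the network to improve its encode-decode ability. Third, building on the classical Swin Transformer, we further design a high-quality reconstruction transformer (i.e., Recformer) to improve reconstruction performance. The Recformer inherits the power of the Swin Transformer to capture the global and local features of the reconstructed image. Experiments on numerical datasets with 48 views demonstrate that the proposed MIST-net provides higher reconstructed image quality, with better recovery of small features and better edge protection, than other competitors, including advanced unrolled networks. The quantitative results show that MIST-net also obtains the best performance. The trained network was transferred to a real cardiac CT dataset with 48 views; the reconstruction results further validate the advantages of MIST-net and demonstrate its good robustness in clinical applications.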
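As background for the trainable Sobel filter mentioned above, the following minimal sketch applies the standard fixed 3x3 Sobel kernels to a small image stored as nested lists. It is only an illustration of the classical operator: in a trainable variant, kernel weights like these would be learnable parameters, and how MIST-net initializes or trains them is not stated in the abstract. The function names are hypothetical.

```python
# Standard Sobel kernels for horizontal (x) and vertical (y) intensity gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(image, kernel):
    """Valid-mode 3x3 cross-correlation over a 2D list of numbers."""
    height, width = len(image), len(image[0])
    out = []
    for i in range(height - 2):
        row = []
        for j in range(width - 2):
            row.append(
                sum(
                    kernel[a][b] * image[i + a][j + b]
                    for a in range(3)
                    for b in range(3)
                )
            )
        out.append(row)
    return out


def edge_magnitude(image):
    """Gradient magnitude sqrt(gx^2 + gy^2) at every interior pixel."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [
        [(x * x + y * y) ** 0.5 for x, y in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


# A 4x4 toy image containing a vertical step edge.
image = [[0, 0, 1, 1] for _ in range(4)]
print(edge_magnitude(image))
```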
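Contrastive learning has achieved remarkable success on various high-level tasks, but fewer methods have been proposed for low-level tasks. Directly applying vanilla contrastive learning techniques, which were proposed for high-level visual tasks, to low-level visual tasks is difficult, because the global visual representations they acquire are insufficient for low-level tasks that require rich texture and context information. In this paper, we propose a novel contrastive learning framework for single image super-resolution (SISR). We investigate contrastive learning-based SISR from two perspectives: sample construction and feature embedding. Existing methods propose some naive sample construction approaches (e.g., treating the low-quality input as a negative sample and the ground truth as a positive sample), and they adopt a prior model (e.g., a pre-trained VGG model) to obtain the feature embedding instead of exploring a task-friendly one. To this end, we propose a practical contrastive learning framework for SISR that involves generating many informative positive and negative samples in frequency space. Instead of utilizing an additional pre-trained network, we design a simple but effective embedding network that is inherited from the discriminator network and can be iteratively optimized with the primary SR network, making it task-generalizable. Finally, we conduct an extensive experimental evaluation of our method against benchmark methods and show remarkable gains of up to 0.21 dB over current state-of-the-art SISR approaches.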
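For readers unfamiliar with contrastive objectives, the sketch below computes a generic InfoNCE-style loss for one anchor, one positive, and several negatives, using cosine similarity on plain feature vectors. This is a common textbook form and an assumption made for illustration; the abstract does not specify the paper's actual loss, its frequency-space sample construction, or its discriminator-derived embedding network, and none of those are reproduced here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: low when the anchor is closer to the positive than to the negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# The anchor matches the positive and is orthogonal to the negative: small loss.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
# The anchor matches the negative instead: large loss.
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned < misaligned)
```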
Forward-Looking Sonar (FLS) has started to gain attention in the field of near-bottom close-range underwater inspection because of its high resolution and high frame rate. Although Automatic Target Recognition (ATR) algorithms have been applied tentatively to object-searching tasks, human supervision is still indispensable, especially where critical areas are involved. A clear FLS mosaic containing all suspicious information is needed to help experts deal with the large volume of perception data. However, previous work has only considered FLS operating in an ideal system configuration, which assumes an appropriate sonar imaging setup and the availability of accurate positioning data. When those assumptions do not hold, intra-frame and inter-frame artifacts appear and degrade the quality of the final mosaic by obscuring the information of interest. In this paper, we propose a novel blending method for FLS mosaicing that preserves the information of interest. A Long-Short Time Sliding Window (LST-SW) is designed to rectify the local statistics of raw sonar images. The statistics are then used to construct a Global Variance Map (GVM). In the blending phase, the GVM emphasizes the useful information contained in the images by classifying pixels as informative or featureless, thereby enhancing the quality of the final mosaic. The method is verified using data collected in a real environment. The results show that our method preserves more details in FLS mosaics for human inspection in practice.
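The idea of separating informative pixels from featureless ones with a variance map can be sketched as follows. This is not the LST-SW or GVM construction from the paper, whose details the abstract does not give. As a stand-in, the sketch assumes a window of co-registered frames stored as nested lists, computes the per-pixel variance across that window, and applies a hand-picked threshold; all names and the threshold are hypothetical.

```python
from statistics import pvariance


def temporal_variance_map(frames):
    """Per-pixel population variance across a window of equally sized frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [pvariance([frame[i][j] for frame in frames]) for j in range(width)]
        for i in range(height)
    ]


def informative_mask(frames, threshold):
    """Mark a pixel as informative when its variance exceeds the threshold (assumed rule)."""
    return [
        [value > threshold for value in row]
        for row in temporal_variance_map(frames)
    ]


# Two 1x2 toy frames: the first pixel is constant, the second one varies.
window = [[[0, 5]], [[0, 7]]]
print(informative_mask(window, threshold=0.5))
```

A blending stage could then weight pixels by such a mask, for example by favoring the frames in which a pixel is marked informative.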
This paper introduced key aspects of applying Machine Learning (ML) models, improved trading strategies, and the Quasi-Reversibility Method (QRM) to optimize stock option forecasting and trading results. It presented the findings of the follow-up project of the research "Application of Convolutional Neural Networks with Quasi-Reversibility Method Results for Option Forecasting". First, the project included an application of Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks to provide a novel way of predicting stock option trends. Additionally, it examined the dependence of the ML models by evaluating the experimental method of combining multiple ML models to improve prediction results and decision-making. Lastly, two improved trading strategies and simulated investing results were presented. The Binomial Asset Pricing Model with discrete time stochastic process analysis and portfolio hedging was applied and suggested an optimized investment expectation. These results can be utilized in real-life trading strategies to optimize stock option investment results based on historical data.
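To make the Binomial Asset Pricing Model concrete, here is a minimal textbook sketch that prices a European call by risk-neutral expectation over an n-step binomial tree. The per-step up factor, down factor, and interest rate are illustrative parameters chosen here, not values from the paper, and the sketch does not reproduce the project's trading strategies or hedging analysis.

```python
from math import comb


def binomial_call_price(s0, strike, up, down, rate, steps):
    """European call price in an n-step binomial model.

    up, down: per-step multiplicative price factors; rate: per-step interest rate.
    """
    if not down < 1 + rate < up:
        raise ValueError("no-arbitrage requires down < 1 + rate < up")
    # Risk-neutral probability of an up move.
    p = (1 + rate - down) / (up - down)
    expected_payoff = 0.0
    for k in range(steps + 1):  # k counts the up moves
        terminal = s0 * up**k * down ** (steps - k)
        weight = comb(steps, k) * p**k * (1 - p) ** (steps - k)
        expected_payoff += weight * max(terminal - strike, 0.0)
    # Discount the expected payoff back to time zero.
    return expected_payoff / (1 + rate) ** steps


print(binomial_call_price(s0=100.0, strike=100.0, up=1.1, down=0.9, rate=0.0, steps=2))
```

With a zero interest rate the risk-neutral probability is 0.5 for these factors, so the price is simply the average payoff over the terminal nodes weighted by the binomial coefficients.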
This paper presents a methodology for combining programming and mathematics to optimize elevator wait times. Based on simulated user data generated according to the canonical three-peak model of elevator traffic, we first develop a naive model from an intuitive understanding of the logic behind elevators. We take into consideration a general array of features including capacity, acceleration, and maximum wait time thresholds to adequately model realistic circumstances. Using the same evaluation framework, we proceed to develop a Deep Q Learning model in an attempt to match the hard-coded naive approach for elevator control. Throughout the majority of the paper, we work under a Markov Decision Process (MDP) schema, but later explore how the assumption fails to characterize the highly stochastic overall Elevator Group Control System (EGCS).
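The learning rule underlying Deep Q Learning under an MDP can be illustrated with a minimal tabular sketch. A deep Q-network replaces the table with a neural network but trains toward the same temporal-difference target. The state and action names, the reward convention, and the hyperparameters below are hypothetical and do not come from the paper.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update to the table q and return the new value.

    q maps (state, action) pairs to estimated returns; missing entries count as 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next  # temporal-difference target
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


# Hypothetical elevator states and actions; in a wait-time setting the reward
# would typically be negative (a cost), which is an assumption here.
actions = ["up", "down"]
q_table = {}
first = q_update(q_table, "floor1", "up", 1.0, "floor2", actions)
q_table[("floor2", "up")] = 2.0
second = q_update(q_table, "floor1", "up", 1.0, "floor2", actions)
print(first, second)
```

The update relies on the Markov assumption that the next state and reward depend only on the current state and action, which is the assumption the paper later questions for the full EGCS.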